9 research outputs found

    Translation inference through multi-lingual word embedding similarity

    This paper describes our contribution to the Shared Task on Translation Inference across Dictionaries (TIAD-2019). In our approach, we construct a multi-lingual word embedding space by projecting new languages into the feature space of a language for which a pretrained embedding model exists. We use the similarity of the word embeddings to predict candidate translations. Although our projection methodology is rather simplistic, our system outperforms the other participating systems with respect to the F1 measure for the language pairs we predicted.
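
The similarity step described in this abstract can be illustrated with a minimal sketch. It assumes the word vectors have already been projected into one shared space (the abstract does not specify the projection, so it is left out here), and it assumes cosine similarity as the measure. The function names, the `top_k` and `threshold` parameters, and the toy vectors are hypothetical and not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def candidate_translations(source_vec, target_space, top_k=3, threshold=0.0):
    """Rank target-language words by similarity to a source-language vector.

    target_space maps each target word to its vector in the shared space
    (assumed to be precomputed). Words scoring below threshold are dropped.
    """
    scored = [(word, cosine(source_vec, vec)) for word, vec in target_space.items()]
    scored = [(word, score) for word, score in scored if score >= threshold]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy, hand-made vectors for illustration only (not real embeddings).
target_space = {
    "casa": [0.9, 0.1, 0.0],
    "perro": [0.0, 1.0, 0.1],
    "hogar": [0.8, 0.2, 0.1],
}
house = [1.0, 0.1, 0.0]
ranked = candidate_translations(house, target_space, top_k=2)
```

In a real setting the threshold would presumably be tuned on held-out dictionary entries, since it trades precision against recall in the F1 measure.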

    Intelligente Schleusenzulaufsteuerung zur Effizienzsteigerung des Binnenschiffsverkehrs (Intelligent lock approach control for increasing the efficiency of inland waterway traffic)

    Remarks on the poster; for the poster contribution, see: https://hdl.handle.net/20.500.11970/11044

    Intelligente Schleusenzulaufsteuerung zur Effizienzsteigerung des Binnenschiffsverkehrs (Intelligent lock approach control for increasing the efficiency of inland waterway traffic)

    The poster contribution refers to the poster; see https://hdl.handle.net/20.500.11970/11044

    Towards LLOD-based language contact studies: a case study in interoperability

    We describe a methodological and technical framework for conducting qualitative and quantitative studies of linguistic research questions over diverse and heterogeneous data sources such as corpora and elicitations. We demonstrate how LLOD formalisms can be employed to develop extraction pipelines for features and linguistic examples from corpora and collections of interlinear glossed text, and furthermore, how SPARQL UPDATE can be employed (1) to normalize diverse data against a reference data model (here, POWLA), (2) to harmonize annotation vocabularies by reference to terminology repositories (here, OLiA), (3) to extract examples from these normalized data structures regardless of their origin, and (4) to implement this extraction routine in a tool-independent manner for different languages with different annotation schemes. We demonstrate our approach for language contact studies of two genetically unrelated but neighboring languages from the Caucasus area, Eastern Armenian and Georgian.

    Universal morphologies for the Caucasus region

    The Caucasus region is famed for its rich and diverse array of languages and language families, which often challenge the European-centered views established in traditional linguistics. In this paper, we describe ongoing efforts to improve the coverage of Universal Morphologies for languages of the Caucasus region. The Universal Morphologies (UniMorph) are a recent community project aiming to complement the Universal Dependencies, which focus on morphosyntax and syntax. We describe the development of UniMorph resources for Nakh-Daghestanian and Kartvelian languages as well as for Classical Armenian. We discuss the challenges that the complex morphology of these and related languages poses to the current design of UniMorph, and suggest possibilities to improve the applicability of UniMorph for languages of the Caucasus region in particular and for low-resource languages in general. We also criticize the UniMorph TSV format for its limited expressiveness, and suggest complementing the existing UniMorph workflow with support for additional source formats on the basis of Linked Open Data technology.

    Using machine learning for translation inference across dictionaries

    This paper describes our contribution to the closed track of the Shared Task Translation Inference across Dictionaries (TIAD2017), held in conjunction with the first Conference on Language Data and Knowledge (LDK-2017). In our approach, we use supervised machine learning to predict high-quality candidate translation pairs. We train a Support Vector Machine using several features, mostly of the translation graph, but also taking into consideration string similarity (Levenshtein distance). As the closed track does not provide manual training data, we define positive training examples as translation candidate pairs which occur in a cycle in which there is a direct connection
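
The string-similarity feature named in this abstract, Levenshtein distance, can be computed with the standard dynamic-programming recurrence. The sketch below is only an illustration of that measure: the normalization to a 0..1 score in `string_similarity` is an assumption made here for convenience as a classifier feature, not something the abstract states, and both function names are hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop to save memory
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            insert = current[j - 1] + 1
            delete = previous[j] + 1
            substitute = previous[j - 1] + (char_a != char_b)
            current.append(min(insert, delete, substitute))
        previous = current
    return previous[-1]


def string_similarity(a, b):
    """Assumed normalization: 1.0 for identical strings, 0.0 for maximally different."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest
```

A value such as `string_similarity("nation", "nación")` could then be passed to a classifier alongside graph-based features.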